ABSTRACT

We have developed a web server, PRince, which analyzes the structural
features and physicochemical properties of the protein-RNA interface.
Users submit a PDB file ('.pdb' format) containing the atomic
coordinates of both the protein and the RNA molecules in complex form,
and specify the chain identifiers of the interacting protein and RNA
molecules. The size of the protein-RNA interface is estimated by
measuring the solvent accessible surface area buried in contact. For a
given protein-RNA complex, PRince calculates structural,
physicochemical and hydration properties of the interacting surfaces.
All parameters generated by the server are presented in a tabular
format. The interacting surfaces can also be visualized with a software
plug-in such as Jmol. In addition, the output files containing the
atomic coordinates of the interacting protein, RNA and interface water
molecules can be downloaded. The parameters generated by PRince are
novel, and users can correlate them with experimentally determined
biophysical and biochemical parameters to better understand the
specificity of the protein-RNA recognition process. This server will be
continuously upgraded to include more parameters. PRince is publicly
accessible and free for use. Available at
http://www.facweb.iitkgp.ernet.in/~rbahadur/prince/home.html.

INTRODUCTION 

Protein-RNA interaction is ubiquitous in many cellular processes. The
interaction is specific in nature, and non-specific interaction can
lead to malfunction of the cell. Interfaces formed by interactions
between protein and RNA molecules provide context for understanding the
principles of molecular recognition in vivo. Over the last few decades,
remarkable progress has been made in understanding the structural and
functional aspects of protein-RNA interactions from their
three-dimensional atomic structures (1-8). The ever-expanding Protein
Data Bank (PDB) (9), which is the central repository of structural
information on macromolecules and their complexes, also helps in such
understanding. Concurrently, there have been attempts to analyze the
structural geometry and physicochemical properties of the interfaces
using a number of parameters based on the features of the interacting
surfaces (10-20). These analyses have been further extended to develop
software, web servers or both for automatic calculation of these
parameters (21-23). Some of these programs and web servers are used in
the prediction of RNA binding sites in proteins (24-29). Nevertheless,
in spite of all these developments, our understanding of the
protein-RNA recognition process is still not adequate to explain the
structural basis of the conformational changes during recognition (30),
the mechanism of sequence-specific recognition (31-33), or the
prediction of protein-RNA complexes through docking methods (34-36).

In recent years, we have developed several parameters based on the
geometric structure and physicochemical properties of the interacting
surfaces in biomolecules (37). We have also studied the hydration
pattern of the interfaces, and developed parameters to investigate the
role of interface water molecules in the recognition process (38). All
these parameters are useful in understanding the structural specificity
of the recognition process in protein-protein complexes, and are
extensively used in discriminating specific protein-protein interfaces
from non-specific ones (39-42). They have also been successfully used
to understand the specific recognition of RNA molecules on the protein
surface, suggesting that the protein-RNA recognition process involves
elements of shape recognition as well as electrostatic interaction and
the recognition of the base sequence (5,19). In order to calculate all
these interface parameters for a given protein-RNA complex, we have
automated the method and implemented it in a web server named PRince
(Protein-RNA interface;
http://www.facweb.iitkgp.ernet.in/~rbahadur/prince/home.html). This
article describes the development of the web server PRince.

PROGRAM DESCRIPTION 

Input file and format 

This server allows users to submit a protein-RNA complex file or a
dataset of protein-RNA complexes in the PDB format containing the
protein and the RNA chains. Users must also indicate the chain
identifiers of the protein and the RNA units (a maximum of eight
protein chains and eight RNA chains is allowed). The server can handle
up to 20 000 atoms for each of the protein and the RNA chains. Detailed
information about the user-submitted PDB files and the chain
identifiers for the protein and the RNA is displayed on the server page
once the calculations are completed.
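
The input constraints above can be checked before submission. The following is a minimal sketch, not the server's own code: the function names are hypothetical, only ATOM records are counted, and the 20 000-atom limit is assumed to apply to each chain separately (the text leaves this open). The chain identifier is read from column 22 of the PDB record, as defined by the PDB file format.

```python
from collections import defaultdict

MAX_CHAINS = 8      # per molecule type, as stated in the text
MAX_ATOMS = 20000   # ASSUMPTION: interpreted here as a per-chain limit


def atoms_by_chain(pdb_text):
    """Group ATOM records by chain identifier (column 22 of a PDB record)."""
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        # A record must reach column 54 to carry full x, y, z coordinates.
        if line.startswith("ATOM") and len(line) >= 54:
            chains[line[21]].append(line)
    return chains


def check_submission(pdb_text, protein_chains, rna_chains):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    chains = atoms_by_chain(pdb_text)
    for label, ids in (("protein", protein_chains), ("RNA", rna_chains)):
        if not 1 <= len(ids) <= MAX_CHAINS:
            problems.append(
                f"{label}: expected 1 to {MAX_CHAINS} chains, got {len(ids)}"
            )
        for cid in ids:
            n_atoms = len(chains.get(cid, []))
            if n_atoms == 0:
                problems.append(f"{label} chain {cid!r} not found in file")
            elif n_atoms > MAX_ATOMS:
                problems.append(
                    f"{label} chain {cid!r} has {n_atoms} atoms (limit {MAX_ATOMS})"
                )
    return problems
```

Calling `check_submission(text, ["A"], ["B"])` on the contents of a complex file returns an empty list when chains A and B are present and within the assumed limits.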

Output files and parameters 

The server generates four types of output files in PDB format: (i) list
of the interface amino acids with their atomic coordinates
(1DFU_P.int); (ii) list of the interface nucleotides with their atomic
coordinates (1DFU_R.int); (iii) list of the interface atoms on the
surface of the interacting protein-RNA complex (1DFU_int.pdb); (iv)
list of the interface water molecules along with the interface protein
and RNA atoms (1DFU_wat.int). In all these file names, 1DFU, the PDB
identifier of a given protein-RNA complex, is used as an example. These
files do not contain the occupancy and B-factor columns; instead they
have three columns that give the solvent accessible surface area (SASA)
of each constituent atom in the individual subunit, in the complex, and
the difference between the two. Users can download these output files
for further calculations, for example, to calculate the interface area
contributed by individual residues, or to find the important water
molecules at the binding surface. In addition, they can display these
files with the Jmol plug-in. Figure 1 shows interface atoms, surface
atoms and interface water molecules for the protein and the RNA
subunits involved in a complex formed by ribosomal protein L25 and 5S
rRNA fragment (PDB id 1DFU). Besides these downloadable files, the
server also generates a downloadable table with the statistics of the
interface parameters (1DFU_param.txt), which are discussed below.
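
As an illustration of the further calculations mentioned above, the per-residue interface area can be obtained by summing the buried SASA of each residue's atoms. This is a sketch under stated assumptions: the exact column layout of the output files is not given in the text, so the code assumes standard PDB columns for residue name, chain and residue number, and assumes that the three SASA values follow the coordinates as whitespace-separated fields. The function name is hypothetical.

```python
from collections import defaultdict


def residue_interface_area(lines):
    """Sum the per-atom buried SASA (third SASA column) for each residue."""
    area = defaultdict(float)
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        res_name = line[17:20].strip()   # PDB columns 18-20
        chain = line[21]                 # PDB column 22
        res_seq = int(line[22:26])       # PDB columns 23-26
        # ASSUMPTION: the SASA in the subunit, the SASA in the complex and
        # their difference come right after the coordinates (column 54).
        _free, _in_complex, buried = (float(x) for x in line[54:].split()[:3])
        area[(chain, res_seq, res_name)] += buried
    return dict(area)
```

Sorting the returned dictionary by value gives a ranked list of the residues that contribute most to the interface.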

Statistics of the interface parameters 

The size of the protein-RNA interface is estimated by 
calculating the interface area (B). It is calculated in 
terms of the SASA of the protein and the RNA molecules 
and is given by the following equation 

$$B = \mathrm{SASA}_{\mathrm{protein}} + \mathrm{SASA}_{\mathrm{RNA}} - \mathrm{SASA}_{\mathrm{complex}} \qquad (1)$$

Here, $\mathrm{SASA}_{\mathrm{protein}}$ and $\mathrm{SASA}_{\mathrm{RNA}}$
are the SASAs of the two interacting molecules, and
$\mathrm{SASA}_{\mathrm{complex}}$ is that of the complex. The interface
area B is the area of the protein and the RNA solvent accessible
surfaces that becomes buried when the two molecules associate. The SASA
values are calculated from the atomic coordinates by rolling a solvent
probe (with the radius of a water molecule) over the surfaces of the
protein and the RNA molecules using the program NACCESS (43), which
implements the Lee and Richards (44) algorithm. All the atoms
(belonging to amino acid residues and nucleotides) that lose solvent
accessibility in the complex and contribute to B are considered
interface atoms. The ratio of the interface area to the rest of the
surface area is calculated for the individual protein and RNA molecules
as well as for the whole complex.

Figure 1. Graphical representation of the protein-RNA interface formed
by ribosomal protein L25 and 5S rRNA fragment (PDB id, 1DFU). (A)
Interface region along with the water molecules. Atoms belonging to the
protein, the RNA and the interface water molecules are colored blue,
red and cyan, respectively. (B) Surface regions of the protein and the
RNA are colored blue and red, respectively, while their interface
region is colored green.
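
The bookkeeping behind Equation (1) can be sketched as follows. This is an illustrative example, not the server's implementation: it assumes the per-atom SASA values have already been computed (for instance by NACCESS) and are held in dictionaries keyed by a hypothetical atom label. The tolerance `tol` is also an assumption, since the text does not state a threshold for "losing" accessibility.

```python
def interface_area(sasa_protein, sasa_rna, sasa_complex):
    """Equation (1): B = SASA_protein + SASA_RNA - SASA_complex."""
    return sasa_protein + sasa_rna - sasa_complex


def buried_area_per_atom(sasa_free, sasa_in_complex):
    """Per-atom loss of SASA on complex formation (free minus complexed)."""
    return {atom: sasa_free[atom] - sasa_in_complex[atom] for atom in sasa_free}


def interface_atoms(sasa_free, sasa_in_complex, tol=0.0):
    """Atoms that lose more than `tol` of SASA in the complex.

    ASSUMPTION: `tol` is a hypothetical cutoff; the text gives none.
    """
    lost = buried_area_per_atom(sasa_free, sasa_in_complex)
    return sorted(atom for atom, delta in lost.items() if delta > tol)
```

Summing the per-atom losses over all atoms of both molecules gives the same B as Equation (1), which is why the per-atom difference can be used to attribute the interface area to individual atoms.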

The chemical composition of the interface or the solvent accessible
surface is estimated by measuring the contribution of the different
atom types to the respective interface area and solvent accessible
surface area (SASA), and is calculated by the following equation:

$$f_{\mathrm{comp}} = \frac{\text{interface (or solvent accessible surface) area contributed by a particular atom type}}{\text{total interface (or solvent accessible surface) area}} \qquad (2)$$

The composition is divided into four different types: (i) nonpolar,
(ii) neutral polar, (iii) negatively charged and (iv) positively
charged. At the protein surface, all the carbon-containing groups are
considered nonpolar; O, N and S are considered neutral polar; N is
positively charged in Arg/Lys side chains; O is negatively charged in
Asp/Glu side chains. At the RNA surface, all the carbon-containing
groups are considered nonpolar; N and O are neutral polar except O1P
and O2P, which are considered negatively charged (19).
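
The classification rules and Equation (2) can be expressed compactly in code. The sketch below is one possible reading of the rules, with hypothetical function names. Three points are assumptions: any side-chain N of Arg or Lys is treated as positively charged, the newer PDB names OP1 and OP2 are accepted alongside O1P and O2P, and atoms the text does not cover (such as the phosphorus atom) are returned as "unclassified".

```python
BACKBONE = ("N", "CA", "C", "O", "OXT")  # protein main-chain atom names


def classify_atom(molecule, res_name, atom_name, element):
    """Assign one of the four atom types described in the text."""
    if element == "C":
        return "nonpolar"
    if molecule == "protein":
        side_chain = atom_name not in BACKBONE
        if element == "N" and res_name in ("ARG", "LYS") and side_chain:
            return "positive"
        if element == "O" and res_name in ("ASP", "GLU") and side_chain:
            return "negative"
        if element in ("N", "O", "S"):
            return "neutral polar"
    elif molecule == "rna":
        # OP1/OP2 are the remediated PDB names of O1P/O2P.
        if atom_name in ("O1P", "O2P", "OP1", "OP2"):
            return "negative"
        if element in ("N", "O"):
            return "neutral polar"
    return "unclassified"  # ASSUMPTION: not covered by the rules in the text


def composition(typed_areas):
    """Equation (2): fraction of the total area contributed by each type.

    `typed_areas` is an iterable of (atom_type, area) pairs.
    """
    totals = {}
    for atom_type, area in typed_areas:
        totals[atom_type] = totals.get(atom_type, 0.0) + area
    total = sum(totals.values())
    return {t: a / total for t, a in totals.items()} if total else {}
```

Passing per-atom buried areas to `composition` gives the interface composition, while passing per-atom SASA values of the free molecule gives the surface composition.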

Depending on their spatial distribution, the interface atoms can be
divided into two categories. Those that are not accessible to any
solvent molecule are called fully buried interface atoms, and those
that remain partly accessible to solvent molecules are called partially
buried interface atoms (39). The atomic packing of the interface is
quantified by the following equation:

$$f_{\mathrm{bu}} = \frac{\text{number of fully buried interface atoms}}{\text{total number of interface atoms}} \qquad (3)$$

This fraction will be higher for a closely packed interface than for a
loosely packed one.
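
The fraction of fully buried interface atoms can be computed directly from per-atom SASA values. The following sketch assumes, as an illustration only, that an atom counts as fully buried when its SASA in the complex is zero (within a hypothetical tolerance `tol`), and that an interface atom is one whose SASA decreases on complex formation.

```python
def fraction_fully_buried(sasa_free, sasa_in_complex, tol=0.0):
    """Fully buried interface atoms divided by all interface atoms.

    Both arguments map an atom label to its SASA. ASSUMPTION: `tol` is a
    hypothetical threshold used for both the loss and the burial test.
    """
    interface = [
        atom for atom in sasa_free
        if sasa_free[atom] - sasa_in_complex[atom] > tol
    ]
    if not interface:
        return 0.0
    fully_buried = [atom for atom in interface if sasa_in_complex[atom] <= tol]
    return len(fully_buried) / len(interface)
```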

Local atomic density (LD) index is used to measure the overall density
of the interface as described by Bahadur et al. (39). In brief, for
each interface atom $i$, the number $n_i$ of interface atoms that are
within a distance of 12 Å of atom $i$ in the same subunit is counted.
LD is the average of $n_i$ over all $N$ interface atoms and is given by
the following equation:

$$\mathrm{LD} = \frac{\sum_{i} n_i}{N} \qquad (4)$$

LD measures the packing density at each point of the interface.
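
A direct, quadratic-time sketch of the LD calculation is given below. It is illustrative only: the function name and the input layout (a dictionary mapping each subunit to the coordinates of its interface atoms) are assumptions, atom $i$ itself is excluded from its own count, and a distance exactly equal to the cutoff is counted as within range.

```python
import math


def local_density(interface_coords_by_subunit, cutoff=12.0):
    """Mean number of same-subunit interface atoms within `cutoff` angstroms.

    `interface_coords_by_subunit` maps a subunit label to a list of
    (x, y, z) tuples for that subunit's interface atoms.
    """
    counts = []
    for coords in interface_coords_by_subunit.values():
        for i, p in enumerate(coords):
            # n_i: neighbours of atom i among interface atoms of its subunit
            n_i = sum(
                1 for j, q in enumerate(coords)
                if j != i and math.dist(p, q) <= cutoff
            )
            counts.append(n_i)
    return sum(counts) / len(counts) if counts else 0.0
```

For large interfaces a spatial grid or k-d tree would avoid the all-pairs loop, but the simple form keeps the correspondence with Equation (4) easy to follow.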

Polar interactions made by the amino acids and the nucleotides are
expressed by the intermolecular hydrogen bonds (H-bonds) between
protein and RNA. Water molecules at the interface play an important
role in stabilizing the protein-RNA interface by making polar
interactions. A water molecule is selected as an interface water if it
is within 4.5 Å of at least one atom of both the protein and the RNA
chains (38). Direct protein-RNA intermolecular H-bonds as well as
water-mediated H-bonds across the interface are calculated by the
program HBPLUS with default parameters (45). All these interface
parameters generated by the server for a protein-RNA complex formed by
ribosomal protein L25 and 5S rRNA fragment (PDB id, 1DFU) are given in
Table 1.
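
The distance criterion for interface waters can be sketched as follows. This example covers only the 4.5 Å selection rule; it does not reproduce the H-bond detection performed by HBPLUS. The function name and the plain coordinate lists are assumptions made for illustration.

```python
import math


def interface_waters(water_coords, protein_coords, rna_coords, cutoff=4.5):
    """Indices of waters within `cutoff` angstroms of both molecules.

    Each argument is a list of (x, y, z) tuples; for waters, the oxygen
    position is assumed to be used.
    """
    def near(point, atoms):
        return any(math.dist(point, atom) <= cutoff for atom in atoms)

    return [
        index for index, water in enumerate(water_coords)
        if near(water, protein_coords) and near(water, rna_coords)
    ]
```

A water that is close to the protein but not to the RNA (or the reverse) is excluded, which matches the requirement that an interface water contacts both partners.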

PROGRAM IMPLEMENTATION 

The server runs on a 3.0 GHz Xeon processor with the Linux operating
system. The programs for calculating the parameters are written in the
C programming language, and the web interface has been developed using
JavaScript and PHP. The server generally takes 30 s to print the output
data for an average-size protein-RNA complex, and it can take around a
minute for a large complex. Users must install a Java plug-in or the
Java Runtime Environment (JRE) for the browser to view the 3D
structures of the interfaces in Jmol. This web server is best viewed in
Mozilla Firefox, Google Chrome, Safari or Opera, and it may run slowly
on older versions of Internet Explorer. In addition, we have provided a
non-redundant dataset of 81 protein-RNA complexes and their interfaces
compiled by Bahadur et al. (19), which can be downloaded from the
server website.

Table 1. Interface parameters for a protein-RNA complex formed by

a Percentage compositions are calculated using Equation (2) described
in the text.

b The bridging waters are identified as those interface waters making
H-bonds with both protein and RNA atoms.



CONCLUSION 

We have developed a web server, PRince, to analyze the structural and
physicochemical properties of protein-RNA interfaces. Users can submit
a protein-RNA complex file with a list of interacting protein and RNA
chains. The server generates several parameters describing the
structural specificity of the interaction. These parameters can be used
for further analysis, and users can correlate them with experimentally
determined biophysical and biochemical parameters to better understand
the specificity of the protein-RNA recognition process. This server
will be continuously upgraded to include more parameters. PRince is
free and open to all, and there is no login requirement for users.